linux-rt-users.vger.kernel.org archive mirror
* [PATCH v2] sched: push rt tasks only if newly activated tasks have been added
@ 2008-04-21 15:42 Gregory Haskins
  2008-04-21 16:17 ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: Gregory Haskins @ 2008-04-21 15:42 UTC (permalink / raw)
  To: suresh.b.siddha
  Cc: mingo, rostedt, chinang.ma, arjan, willy, ghaskins, linux-kernel,
	linux-rt-users

Hi Ingo, et. al.,
 Steven asked me to comment more thoroughly for posterity, so here is v2.

Changes since v1:

1) Updated header to clarify regression as a perf. regression.
2) Added inline comments to explain nature of change more clearly.
3) Added a further optimization to post_schedule_rt() to avoid retaking the
   expensive rq-lock if we don't need to push.

Regards,
-Greg

----------------------------------------------
sched: push rt tasks only if newly activated tasks have been added

SCHED_RR can context-switch many times without having changed the run-queue.
Therefore trying to push on each context switch can just be wasted effort
since if it failed the first time, it will likely fail any subsequent times
as well.  Instead, set a flag when we have successfully pushed as many tasks
away as possible, and only clear it when the runqueue adds new tasks
(effectively becoming a run-queue "dirty bit").  If new tasks are added we
should try again.  If any remote run-queues downgrade their priority in the
meantime, they will try to pull from us (as they always do).

This attempts to address a performance regression reported by Suresh Siddha,
et al. in the 2.6.25 series.  It applies to 2.6.25.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
CC: suresh.b.siddha@intel.com
CC: mingo@elte.hu
CC: rostedt@goodmis.org
CC: chinang.ma@intel.com
CC: arjan@linux.intel.com
CC: willy@linux.intel.com
---

 kernel/sched.c    |    2 ++
 kernel/sched_rt.c |   22 ++++++++++++++++++++--
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 8dcdec6..806881b 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -331,6 +331,7 @@ struct rt_rq {
 #ifdef CONFIG_SMP
 	unsigned long rt_nr_migratory;
 	int overloaded;
+	int pushed;
 #endif
 	int rt_throttled;
 	u64 rt_time;
@@ -7142,6 +7143,7 @@ static void init_rt_rq(struct rt_rq *rt_rq, struct rq *rq)
 #ifdef CONFIG_SMP
 	rt_rq->rt_nr_migratory = 0;
 	rt_rq->overloaded = 0;
+	rt_rq->pushed = 0;
 #endif
 
 	rt_rq->rt_time = 0;
diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index 0a6d2e5..5007723 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -393,6 +393,13 @@ static void enqueue_task_rt(struct rq *rq, struct task_struct *p, int wakeup)
 	 */
 	for_each_sched_rt_entity(rt_se)
 		enqueue_rt_entity(rt_se);
+
+	/*
+	 * Clear the "pushed" state since we have changed the run-queue and
+	 * may need to migrate some of these tasks (see push_rt_tasks() for
+	 * details)
+	 */
+	rq->rt.pushed = 0;
 }
 
 static void dequeue_task_rt(struct rq *rq, struct task_struct *p, int sleep)
@@ -789,7 +796,7 @@ static int push_rt_task(struct rq *rq)
 	int ret = 0;
 	int paranoid = RT_MAX_TRIES;
 
-	if (!rq->rt.overloaded)
+	if (!rq->rt.overloaded || rq->rt.pushed)
 		return 0;
 
 	next_task = pick_next_highest_task_rt(rq, -1);
@@ -849,6 +856,15 @@ out:
 }
 
 /*
+ * push_rt_tasks()
+ *
+ * Push as many tasks away as possible.  We only need to try once whenever
 * one or more new tasks are added to our runqueue.  After an initial attempt,
+ * further changes in remote runqueue state will be accounted for with pull
+ * operations.  Therefore, we mark the run-queue as "pushed" here, and clear it
+ * during enqueue.  This primarily helps SCHED_RR tasks which will tend to
+ * have a higher context-switch to enqueue ratio.
+ *
  * TODO: Currently we just use the second highest prio task on
  *       the queue, and stop when it can't migrate (or there's
  *       no more RT tasks).  There may be a case where a lower
@@ -863,6 +879,8 @@ static void push_rt_tasks(struct rq *rq)
 	/* push_rt_task will return true if it moved an RT */
 	while (push_rt_task(rq))
 		;
+
+	rq->rt.pushed = 1;
 }
 
 static int pull_rt_task(struct rq *this_rq)
@@ -967,7 +985,7 @@ static void post_schedule_rt(struct rq *rq)
 	 * the lock was owned by prev, we need to release it
 	 * first via finish_lock_switch and then reacquire it here.
 	 */
-	if (unlikely(rq->rt.overloaded)) {
+	if (unlikely(rq->rt.overloaded && !rq->rt.pushed)) {
 		spin_lock_irq(&rq->lock);
 		push_rt_tasks(rq);
 		spin_unlock_irq(&rq->lock);



* Re: [PATCH v2] sched: push rt tasks only if newly activated tasks have been added
  2008-04-21 15:42 [PATCH v2] sched: push rt tasks only if newly activated tasks have been added Gregory Haskins
@ 2008-04-21 16:17 ` Ingo Molnar
  2008-04-21 16:20   ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2008-04-21 16:17 UTC (permalink / raw)
  To: Gregory Haskins
  Cc: suresh.b.siddha, rostedt, chinang.ma, arjan, willy, linux-kernel,
	linux-rt-users


* Gregory Haskins <ghaskins@novell.com> wrote:

> Hi Ingo, et. al.,
>  Steven asked me to comment more thoroughly for posterity, so here is v2.

thanks Gregory, applied.

	Ingo


* Re: [PATCH v2] sched: push rt tasks only if newly activated tasks have been added
  2008-04-21 16:17 ` Ingo Molnar
@ 2008-04-21 16:20   ` Ingo Molnar
  2008-04-21 16:25     ` Steven Rostedt
  2008-04-21 16:34     ` Gregory Haskins
  0 siblings, 2 replies; 6+ messages in thread
From: Ingo Molnar @ 2008-04-21 16:20 UTC (permalink / raw)
  To: Gregory Haskins
  Cc: suresh.b.siddha, rostedt, chinang.ma, arjan, willy, linux-kernel,
	linux-rt-users


got a build failure:

 In file included from kernel/sched.c:1972:
 kernel/sched_rt.c: In function 'enqueue_task_rt':
 kernel/sched_rt.c:522: error: 'struct rt_rq' has no member named 'pushed'

with this config:

 http://redhat.com/~mingo/misc/config-Mon_Apr_21_18_02_45_CEST_2008.bad

with the patch below against sched-devel/latest.

	Ingo

----------->
Subject: sched: push rt tasks only if newly activated tasks have been added
From: Gregory Haskins <ghaskins@novell.com>
Date: Mon, 21 Apr 2008 11:42:18 -0400

SCHED_RR can context-switch many times without having changed the
run-queue. Therefore trying to push on each context switch can just be
wasted effort since if it failed the first time, it will likely fail any
subsequent times as well.  Instead, set a flag when we have successfully
pushed as many tasks away as possible, and only clear it when the
runqueue adds new tasks (effectively becoming a run-queue "dirty bit").
If new tasks are added we should try again.  If any remote run-queues
downgrade their priority in the meantime, they will try to pull from us
(as they always do).

This attempts to address a performance regression reported by Suresh
Siddha, et al. in the 2.6.25 series.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
CC: suresh.b.siddha@intel.com
CC: mingo@elte.hu
CC: rostedt@goodmis.org
CC: chinang.ma@intel.com
CC: arjan@linux.intel.com
CC: willy@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/sched.c    |    2 ++
 kernel/sched_rt.c |   22 ++++++++++++++++++++--
 2 files changed, 22 insertions(+), 2 deletions(-)

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -462,6 +462,7 @@ struct rt_rq {
 #ifdef CONFIG_SMP
 	unsigned long rt_nr_migratory;
 	int overloaded;
+	int pushed;
 #endif
 	int rt_throttled;
 	u64 rt_time;
@@ -8073,6 +8074,7 @@ static void init_rt_rq(struct rt_rq *rt_
 #ifdef CONFIG_SMP
 	rt_rq->rt_nr_migratory = 0;
 	rt_rq->overloaded = 0;
+	rt_rq->pushed = 0;
 #endif
 
 	rt_rq->rt_time = 0;
Index: linux/kernel/sched_rt.c
===================================================================
--- linux.orig/kernel/sched_rt.c
+++ linux/kernel/sched_rt.c
@@ -514,6 +514,13 @@ static void enqueue_task_rt(struct rq *r
 	for_each_sched_rt_entity(rt_se)
 		enqueue_rt_entity(rt_se);
 
+	/*
+	 * Clear the "pushed" state since we have changed the run-queue and
+	 * may need to migrate some of these tasks (see push_rt_tasks() for
+	 * details)
+	 */
+	rq->rt.pushed = 0;
+
 	inc_cpu_load(rq, p->se.load.weight);
 }
 
@@ -913,7 +920,7 @@ static int push_rt_task(struct rq *rq)
 	int ret = 0;
 	int paranoid = RT_MAX_TRIES;
 
-	if (!rq->rt.overloaded)
+	if (!rq->rt.overloaded || rq->rt.pushed)
 		return 0;
 
 	next_task = pick_next_highest_task_rt(rq, -1);
@@ -973,6 +980,15 @@ out:
 }
 
 /*
+ * push_rt_tasks()
+ *
+ * Push as many tasks away as possible.  We only need to try once whenever
 * one or more new tasks are added to our runqueue.  After an initial attempt,
+ * further changes in remote runqueue state will be accounted for with pull
+ * operations.  Therefore, we mark the run-queue as "pushed" here, and clear it
+ * during enqueue.  This primarily helps SCHED_RR tasks which will tend to
+ * have a higher context-switch to enqueue ratio.
+ *
  * TODO: Currently we just use the second highest prio task on
  *       the queue, and stop when it can't migrate (or there's
  *       no more RT tasks).  There may be a case where a lower
@@ -987,6 +1003,8 @@ static void push_rt_tasks(struct rq *rq)
 	/* push_rt_task will return true if it moved an RT */
 	while (push_rt_task(rq))
 		;
+
+	rq->rt.pushed = 1;
 }
 
 static int pull_rt_task(struct rq *this_rq)
@@ -1091,7 +1109,7 @@ static void post_schedule_rt(struct rq *
 	 * the lock was owned by prev, we need to release it
 	 * first via finish_lock_switch and then reacquire it here.
 	 */
-	if (unlikely(rq->rt.overloaded)) {
+	if (unlikely(rq->rt.overloaded && !rq->rt.pushed)) {
 		spin_lock_irq(&rq->lock);
 		push_rt_tasks(rq);
 		spin_unlock_irq(&rq->lock);


* Re: [PATCH v2] sched: push rt tasks only if newly activated tasks have been added
  2008-04-21 16:20   ` Ingo Molnar
@ 2008-04-21 16:25     ` Steven Rostedt
  2008-04-21 16:36       ` [PATCH v2] sched: push rt tasks only if newly activated tasks have " Gregory Haskins
  2008-04-21 16:34     ` Gregory Haskins
  1 sibling, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2008-04-21 16:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Gregory Haskins, suresh.b.siddha, chinang.ma, arjan, willy,
	linux-kernel, linux-rt-users


On Mon, 21 Apr 2008, Ingo Molnar wrote:

>
> got a build failure:
>
>  In file included from kernel/sched.c:1972:
>  kernel/sched_rt.c: In function 'enqueue_task_rt':
>  kernel/sched_rt.c:522: error: 'struct rt_rq' has no member named 'pushed'
>
> with this config:
>
>  http://redhat.com/~mingo/misc/config-Mon_Apr_21_18_02_45_CEST_2008.bad
>
> with the patch below against sched-devel/latest.
>


> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
>  kernel/sched.c    |    2 ++
>  kernel/sched_rt.c |   22 ++++++++++++++++++++--
>  2 files changed, 22 insertions(+), 2 deletions(-)
>
> Index: linux/kernel/sched.c
> ===================================================================
> --- linux.orig/kernel/sched.c
> +++ linux/kernel/sched.c
> @@ -462,6 +462,7 @@ struct rt_rq {
>  #ifdef CONFIG_SMP
>  	unsigned long rt_nr_migratory;
>  	int overloaded;
> +	int pushed;
>  #endif
>  	int rt_throttled;
>  	u64 rt_time;
> @@ -8073,6 +8074,7 @@ static void init_rt_rq(struct rt_rq *rt_
>  #ifdef CONFIG_SMP
>  	rt_rq->rt_nr_migratory = 0;
>  	rt_rq->overloaded = 0;
> +	rt_rq->pushed = 0;
>  #endif
>
>  	rt_rq->rt_time = 0;
> Index: linux/kernel/sched_rt.c
> ===================================================================
> --- linux.orig/kernel/sched_rt.c
> +++ linux/kernel/sched_rt.c
> @@ -514,6 +514,13 @@ static void enqueue_task_rt(struct rq *r
>  	for_each_sched_rt_entity(rt_se)
>  		enqueue_rt_entity(rt_se);
>
> +	/*
> +	 * Clear the "pushed" state since we have changed the run-queue and
> +	 * may need to migrate some of these tasks (see push_rt_tasks() for
> +	 * details)
> +	 */
> +	rq->rt.pushed = 0;
> +

I'm betting that this needs a CONFIG_SMP around it.

-- Steve

>  	inc_cpu_load(rq, p->se.load.weight);
>  }
>
> @@ -913,7 +920,7 @@ static int push_rt_task(struct rq *rq)
>  	int ret = 0;
>  	int paranoid = RT_MAX_TRIES;
>
> -	if (!rq->rt.overloaded)
> +	if (!rq->rt.overloaded || rq->rt.pushed)
>  		return 0;
>
>  	next_task = pick_next_highest_task_rt(rq, -1);
> @@ -973,6 +980,15 @@ out:
>  }
>
>  /*


* Re: [PATCH v2] sched: push rt tasks only if newly activated tasks have been added
  2008-04-21 16:20   ` Ingo Molnar
  2008-04-21 16:25     ` Steven Rostedt
@ 2008-04-21 16:34     ` Gregory Haskins
  1 sibling, 0 replies; 6+ messages in thread
From: Gregory Haskins @ 2008-04-21 16:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: rostedt, chinang.ma, suresh.b.siddha, arjan, willy, linux-kernel,
	linux-rt-users

>>> On Mon, Apr 21, 2008 at 12:20 PM, in message <20080421162032.GA12645@elte.hu>,
Ingo Molnar <mingo@elte.hu> wrote: 

> got a build failure:
> 
>  In file included from kernel/sched.c:1972:
>  kernel/sched_rt.c: In function 'enqueue_task_rt':
>  kernel/sched_rt.c:522: error: 'struct rt_rq' has no member named 'pushed'
> 
> with this config:
> 
>  http://redhat.com/~mingo/misc/config-Mon_Apr_21_18_02_45_CEST_2008.bad
> 
> with the patch below against sched-devel/latest.

Ah, sorry.  I had written it against 2.6.25, so I suspect a fuzz issue.  I will apply it to sched-devel and resolve the errors.

> 
> 	Ingo
> 
> ----------->
> Subject: sched: push rt tasks only if newly activated tasks have been added
> From: Gregory Haskins <ghaskins@novell.com>
> Date: Mon, 21 Apr 2008 11:42:18 -0400
> 
> SCHED_RR can context-switch many times without having changed the
> run-queue. Therefore trying to push on each context switch can just be
> wasted effort since if it failed the first time, it will likely fail any
> subsequent times as well.  Instead, set a flag when we have successfully
> pushed as many tasks away as possible, and only clear it when the
> runqueue adds new tasks (effectively becoming a run-queue "dirty bit").
> If new tasks are added we should try again.  If any remote run-queues
> downgrade their priority in the meantime, they will try to pull from us
> (as they always do).
> 
> This attempts to address a performance regression reported by Suresh
> Siddha, et al. in the 2.6.25 series.
> 
> Signed-off-by: Gregory Haskins <ghaskins@novell.com>
> CC: suresh.b.siddha@intel.com
> CC: mingo@elte.hu
> CC: rostedt@goodmis.org
> CC: chinang.ma@intel.com
> CC: arjan@linux.intel.com
> CC: willy@linux.intel.com
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
>  kernel/sched.c    |    2 ++
>  kernel/sched_rt.c |   22 ++++++++++++++++++++--
>  2 files changed, 22 insertions(+), 2 deletions(-)
> 
> Index: linux/kernel/sched.c
> ===================================================================
> --- linux.orig/kernel/sched.c
> +++ linux/kernel/sched.c
> @@ -462,6 +462,7 @@ struct rt_rq {
>  #ifdef CONFIG_SMP
>  	unsigned long rt_nr_migratory;
>  	int overloaded;
> +	int pushed;
>  #endif
>  	int rt_throttled;
>  	u64 rt_time;
> @@ -8073,6 +8074,7 @@ static void init_rt_rq(struct rt_rq *rt_
>  #ifdef CONFIG_SMP
>  	rt_rq->rt_nr_migratory = 0;
>  	rt_rq->overloaded = 0;
> +	rt_rq->pushed = 0;
>  #endif
>  
>  	rt_rq->rt_time = 0;
> Index: linux/kernel/sched_rt.c
> ===================================================================
> --- linux.orig/kernel/sched_rt.c
> +++ linux/kernel/sched_rt.c
> @@ -514,6 +514,13 @@ static void enqueue_task_rt(struct rq *r
>  	for_each_sched_rt_entity(rt_se)
>  		enqueue_rt_entity(rt_se);
>  
> +	/*
> +	 * Clear the "pushed" state since we have changed the run-queue and
> +	 * may need to migrate some of these tasks (see push_rt_tasks() for
> +	 * details)
> +	 */
> +	rq->rt.pushed = 0;
> +
>  	inc_cpu_load(rq, p->se.load.weight);
>  }
>  
> @@ -913,7 +920,7 @@ static int push_rt_task(struct rq *rq)
>  	int ret = 0;
>  	int paranoid = RT_MAX_TRIES;
>  
> -	if (!rq->rt.overloaded)
> +	if (!rq->rt.overloaded || rq->rt.pushed)
>  		return 0;
>  
>  	next_task = pick_next_highest_task_rt(rq, -1);
> @@ -973,6 +980,15 @@ out:
>  }
>  
>  /*
> + * push_rt_tasks()
> + *
> + * Push as many tasks away as possible.  We only need to try once whenever
> + * one or more new tasks are added to our runqueue.  After an initial attempt,
> + * further changes in remote runqueue state will be accounted for with pull
> + * operations.  Therefore, we mark the run-queue as "pushed" here, and clear it
> + * during enqueue.  This primarily helps SCHED_RR tasks which will tend to
> + * have a higher context-switch to enqueue ratio.
> + *
>   * TODO: Currently we just use the second highest prio task on
>   *       the queue, and stop when it can't migrate (or there's
>   *       no more RT tasks).  There may be a case where a lower
> @@ -987,6 +1003,8 @@ static void push_rt_tasks(struct rq *rq)
>  	/* push_rt_task will return true if it moved an RT */
>  	while (push_rt_task(rq))
>  		;
> +
> +	rq->rt.pushed = 1;
>  }
>  
>  static int pull_rt_task(struct rq *this_rq)
> @@ -1091,7 +1109,7 @@ static void post_schedule_rt(struct rq *
>  	 * the lock was owned by prev, we need to release it
>  	 * first via finish_lock_switch and then reacquire it here.
>  	 */
> -	if (unlikely(rq->rt.overloaded)) {
> +	if (unlikely(rq->rt.overloaded && !rq->rt.pushed)) {
>  		spin_lock_irq(&rq->lock);
>  		push_rt_tasks(rq);
>  		spin_unlock_irq(&rq->lock);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH v2] sched: push rt tasks only if newly activated tasks have been added
  2008-04-21 16:25     ` Steven Rostedt
@ 2008-04-21 16:36       ` Gregory Haskins
  0 siblings, 0 replies; 6+ messages in thread
From: Gregory Haskins @ 2008-04-21 16:36 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: chinang.ma, suresh.b.siddha, arjan, willy, linux-kernel,
	linux-rt-users

>>> On Mon, Apr 21, 2008 at 12:25 PM, in message
<Pine.LNX.4.58.0804211224380.18530@gandalf.stny.rr.com>, Steven Rostedt
<rostedt@goodmis.org> wrote: 


> 
> I'm betting that this needs a CONFIG_SMP around it.

Indeed, thanks!  Will send update against sched-devel with this fix.


> 
> -- Steve
> 
>>  	inc_cpu_load(rq, p->se.load.weight);
>>  }
>>
>> @@ -913,7 +920,7 @@ static int push_rt_task(struct rq *rq)
>>  	int ret = 0;
>>  	int paranoid = RT_MAX_TRIES;
>>
>> -	if (!rq->rt.overloaded)
>> +	if (!rq->rt.overloaded || rq->rt.pushed)
>>  		return 0;
>>
>>  	next_task = pick_next_highest_task_rt(rq, -1);
>> @@ -973,6 +980,15 @@ out:
>>  }
>>
>>  /*

