public inbox for linux-kernel@vger.kernel.org
* [RFC] sched_tick with interrupts enabled
@ 2006-10-18 17:04 Christoph Lameter
  2006-10-18 17:21 ` Nick Piggin
From: Christoph Lameter @ 2006-10-18 17:04 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Siddha, Suresh B, Ingo Molnar, Nick Piggin, Peter Williams,
	linux-kernel

scheduler_tick() has the potential of running for some time if, e.g., the
sched_domains for a system with 1024 processors have to be balanced.
We currently do all of that with interrupts disabled, so we may be unable
to service interrupts for some time.

I wonder if it would be possible to put scheduler_tick() into a tasklet and
allow interrupts to be enabled? Preemption is still disabled, so we
remain pinned to one cpu.

This can only work if we are sure that no scheduler activity is performed
from interrupt code or from other tasklets that may potentially run.

scheduler_tick() takes the runqueue lock without disabling interrupts.
So any other attempt to invoke scheduler functions from interrupts or
tasklets will cause a deadlock.

I tested this patch with AIM7 and things seem to be fine.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

Index: linux-2.6.19-rc2-mm1/kernel/timer.c
===================================================================
--- linux-2.6.19-rc2-mm1.orig/kernel/timer.c	2006-10-18 11:52:53.160984800 -0500
+++ linux-2.6.19-rc2-mm1/kernel/timer.c	2006-10-18 11:53:10.039399811 -0500
@@ -1210,8 +1210,14 @@ static void update_wall_time(void)
 	}
 }
 
+static void sched_tick_action(unsigned long dummy)
+{
+	scheduler_tick();
+}
+
+static DECLARE_TASKLET(scheduler_tick_tasklet, sched_tick_action, 0);
 /*
- * Called from the timer interrupt handler to charge one tick to the current 
+ * Called from the timer interrupt handler to charge one tick to the current
  * process.  user_tick is 1 if the tick is user time, 0 for system.
  */
 void update_process_times(int user_tick)
@@ -1227,7 +1233,7 @@ void update_process_times(int user_tick)
 	run_local_timers();
 	if (rcu_pending(cpu))
 		rcu_check_callbacks(cpu, user_tick);
-	scheduler_tick();
+	tasklet_schedule(&scheduler_tick_tasklet);
  	run_posix_cpu_timers(p);
 }
 
Index: linux-2.6.19-rc2-mm1/kernel/sched.c
===================================================================
--- linux-2.6.19-rc2-mm1.orig/kernel/sched.c	2006-10-18 11:52:51.768286047 -0500
+++ linux-2.6.19-rc2-mm1/kernel/sched.c	2006-10-18 11:53:10.003263868 -0500
@@ -1608,9 +1608,10 @@ void fastcall sched_fork(struct task_str
 		 * runqueue lock is not a problem.
 		 */
 		current->time_slice = 1;
+		local_irq_enable();
 		scheduler_tick();
-	}
-	local_irq_enable();
+	} else
+		local_irq_enable();
 	put_cpu();
 }
 
@@ -3040,7 +3041,8 @@ void account_steal_time(struct task_stru
 
 /*
  * This function gets called by the timer code, with HZ frequency.
- * We call it with interrupts disabled.
+ * We call it with interrupts enabled. However, the thread is
+ * pinned to a specific cpu.
  *
  * It also gets called by the fork code, when changing the parent's
  * timeslices.


* Re: [RFC] sched_tick with interrupts enabled
  2006-10-18 17:04 [RFC] sched_tick with interrupts enabled Christoph Lameter
@ 2006-10-18 17:21 ` Nick Piggin
  2006-10-18 18:01   ` Christoph Lameter
From: Nick Piggin @ 2006-10-18 17:21 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Siddha, Suresh B, Ingo Molnar, Peter Williams, linux-kernel

Christoph Lameter wrote:
> scheduler_tick() has the potential of running for some time if, e.g., the
> sched_domains for a system with 1024 processors have to be balanced.
> We currently do all of that with interrupts disabled, so we may be unable
> to service interrupts for some time.
> 
> I wonder if it would be possible to put scheduler_tick() into a tasklet and
> allow interrupts to be enabled? Preemption is still disabled, so we
> remain pinned to one cpu.

I don't think so, because it also does accounting, which probably wants to
happen precisely on the tick.

Also, the timeslice accounting takes the rq lock without disabling interrupts,
and task wakeups can easily happen from interrupt/softirq context.

Taking rebalance_tick out of scheduler_tick, and not calling rebalance_tick
from sched_fork is probably a good idea.

After that, it might be acceptable to call rebalance_tick from a tasklet,
although it would be unneeded overhead on small systems. It might be better
to have a special case for your large systems which does the full balance
and runs less frequently in a tasklet (and make your regular rebalance_tick
skip the top-level balancing).

-- 
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com 


* Re: [RFC] sched_tick with interrupts enabled
  2006-10-18 17:21 ` Nick Piggin
@ 2006-10-18 18:01   ` Christoph Lameter
  2006-10-18 18:09     ` Nick Piggin
From: Christoph Lameter @ 2006-10-18 18:01 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Siddha, Suresh B, Ingo Molnar, Peter Williams, linux-kernel

> After that, it might be acceptable to call rebalance_tick from a tasklet,
> although it would be unneeded overhead on small systems. It might be better

Here is a patch that runs only rebalance_tick() from a tasklet in the
scheduler. On UP there will be no tasklet.

[RFC] sched_rebalance_tick with interrupts enabled

scheduler_tick() has the potential of running for some time if, e.g., the
sched_domains for a system with 1024 processors have to be balanced.
We currently do all of that with interrupts disabled, so we may be unable
to service interrupts for some time. Most of that time is potentially
spent in rebalance_tick.

This patch splits off rebalance_tick from scheduler_tick and schedules
it via a tasklet.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

Index: linux-2.6.19-rc2-mm1/kernel/sched.c
===================================================================
--- linux-2.6.19-rc2-mm1.orig/kernel/sched.c	2006-10-18 12:28:03.000000000 -0500
+++ linux-2.6.19-rc2-mm1/kernel/sched.c	2006-10-18 12:57:04.848077764 -0500
@@ -2831,13 +2831,37 @@ static inline unsigned long cpu_offset(i
 	return jiffies + cpu * HZ / NR_CPUS;
 }
 
-static void
-rebalance_tick(int this_cpu, struct rq *this_rq, enum idle_type idle)
+static inline int wake_priority_sleeper(struct rq *rq)
+{
+	int ret = 0;
+
+#ifdef CONFIG_SCHED_SMT
+	spin_lock(&rq->lock);
+	/*
+	 * If an SMT sibling task has been put to sleep for priority
+	 * reasons reschedule the idle task to see if it can now run.
+	 */
+	if (rq->nr_running) {
+		resched_task(rq->idle);
+		ret = 1;
+	}
+	spin_unlock(&rq->lock);
+#endif
+	return ret;
+}
+
+
+static void rebalance_tick(unsigned long dummy)
 {
+	int this_cpu = smp_processor_id();
 	unsigned long this_load, interval, j = cpu_offset(this_cpu);
 	struct sched_domain *sd;
 	int i, scale;
+	struct rq *this_rq = cpu_rq(this_cpu);
+	enum idle_type idle = NOT_IDLE;
 
+	if (current == this_rq->idle && !wake_priority_sleeper(this_rq))
+		idle = SCHED_IDLE;
 	this_load = this_rq->raw_weighted_load;
 
 	/* Update our load: */
@@ -2882,37 +2906,18 @@ rebalance_tick(int this_cpu, struct rq *
 		}
 	}
 }
+
+static DECLARE_TASKLET(rebalance, rebalance_tick, 0);
+
 #else
 /*
  * on UP we do not need to balance between CPUs:
  */
-static inline void rebalance_tick(int cpu, struct rq *rq, enum idle_type idle)
-{
-}
 static inline void idle_balance(int cpu, struct rq *rq)
 {
 }
 #endif
 
-static inline int wake_priority_sleeper(struct rq *rq)
-{
-	int ret = 0;
-
-#ifdef CONFIG_SCHED_SMT
-	spin_lock(&rq->lock);
-	/*
-	 * If an SMT sibling task has been put to sleep for priority
-	 * reasons reschedule the idle task to see if it can now run.
-	 */
-	if (rq->nr_running) {
-		resched_task(rq->idle);
-		ret = 1;
-	}
-	spin_unlock(&rq->lock);
-#endif
-	return ret;
-}
-
 DEFINE_PER_CPU(struct kernel_stat, kstat);
 
 EXPORT_PER_CPU_SYMBOL(kstat);
@@ -3056,12 +3061,8 @@ void scheduler_tick(void)
 
 	rq->timestamp_last_tick = now;
 
-	if (p == rq->idle) {
-		if (wake_priority_sleeper(rq))
-			goto out;
-		rebalance_tick(cpu, rq, SCHED_IDLE);
-		return;
-	}
+	if (p == rq->idle)
+		goto out;
 
 	/* Task might have expired already, but not scheduled off yet */
 	if (p->array != rq->active) {
@@ -3135,7 +3136,9 @@ void scheduler_tick(void)
 out_unlock:
 	spin_unlock(&rq->lock);
 out:
-	rebalance_tick(cpu, rq, NOT_IDLE);
+#ifdef CONFIG_SMP
+	tasklet_schedule(&rebalance);
+#endif
 }
 
 #ifdef CONFIG_SCHED_SMT


* Re: [RFC] sched_tick with interrupts enabled
  2006-10-18 18:01   ` Christoph Lameter
@ 2006-10-18 18:09     ` Nick Piggin
  2006-10-18 18:48       ` Christoph Lameter
From: Nick Piggin @ 2006-10-18 18:09 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Siddha, Suresh B, Ingo Molnar, Peter Williams, linux-kernel

Christoph Lameter wrote:
>>After that, it might be acceptable to call rebalance_tick from a tasklet,
>>although it would be unneeded overhead on small systems. It might be better
> 
> 
> Here is a patch that runs only rebalance_tick() from a tasklet in the
> scheduler. On UP there will be no tasklet.
> 
> [RFC] sched_rebalance_tick with interrupts enabled
> 
> scheduler_tick() has the potential of running for some time if, e.g., the
> sched_domains for a system with 1024 processors have to be balanced.
> We currently do all of that with interrupts disabled, so we may be unable
> to service interrupts for some time. Most of that time is potentially
> spent in rebalance_tick.
> 
> This patch splits off rebalance_tick from scheduler_tick and schedules
> it via a tasklet.
> 
> Signed-off-by: Christoph Lameter <clameter@sgi.com>

wake_priority_sleeper should not be called from rebalance_tick. That
code was OK where it was before, I think.

And you now need to turn off interrupts when doing the locking in
load_balance.

-- 
SUSE Labs, Novell Inc.


* Re: [RFC] sched_tick with interrupts enabled
  2006-10-18 18:09     ` Nick Piggin
@ 2006-10-18 18:48       ` Christoph Lameter
  2006-10-18 19:14         ` Nick Piggin
From: Christoph Lameter @ 2006-10-18 18:48 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Siddha, Suresh B, Ingo Molnar, Peter Williams, linux-kernel

On Thu, 19 Oct 2006, Nick Piggin wrote:

> wake_priority_sleeper should not be called from rebalance_tick. That
> code was OK where it was before, I think.

wake_priority_sleeper() is necessary to establish the idle state.

> And you need to now turn off interrupts when doing the locking in
> load_balance.

Hmmm.. yes, that requires reviewing all the locks. Sigh.



* Re: [RFC] sched_tick with interrupts enabled
  2006-10-18 18:48       ` Christoph Lameter
@ 2006-10-18 19:14         ` Nick Piggin
  2006-10-18 21:59           ` Christoph Lameter
From: Nick Piggin @ 2006-10-18 19:14 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Siddha, Suresh B, Ingo Molnar, Peter Williams, linux-kernel

Christoph Lameter wrote:
> On Thu, 19 Oct 2006, Nick Piggin wrote:
> 
> 
>>wake_priority_sleeper should not be called from rebalance_tick. That
>>code was OK where it was before, I think.
> 
> 
> wake_priority_sleeper() is necessary to establish the idle state.

No, it isn't. It's just that if it returns nonzero, we know we are not
idle, so we needn't test for that again.

-- 
SUSE Labs, Novell Inc.

* Re: [RFC] sched_tick with interrupts enabled
  2006-10-18 19:14         ` Nick Piggin
@ 2006-10-18 21:59           ` Christoph Lameter
  2006-10-19  2:19             ` Siddha, Suresh B
From: Christoph Lameter @ 2006-10-18 21:59 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Siddha, Suresh B, Ingo Molnar, Peter Williams, linux-kernel

Revised and tested patch. It takes care to disable interrupts in
load_balance() and fixes the wake_priority_sleeper() issue.

[PATCH] rebalance_tick() with interrupts enabled

Load balancing has the potential of running for some time if, e.g., the
sched_domains for a system with 1024 processors have to be balanced.
We currently do all of that with interrupts disabled, so we may be unable
to service interrupts for some time. Most of that time is potentially
spent in rebalance_tick.

This patch splits off rebalance_tick from scheduler_tick and schedules
it via a tasklet.

Successfully completed an AIM7 run with it.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

Index: linux-2.6.19-rc2-mm1/kernel/sched.c
===================================================================
--- linux-2.6.19-rc2-mm1.orig/kernel/sched.c	2006-10-18 15:45:27.921708469 -0500
+++ linux-2.6.19-rc2-mm1/kernel/sched.c	2006-10-18 15:58:58.255149369 -0500
@@ -2531,7 +2531,7 @@ static inline unsigned long minus_1_or_z
  * Check this_cpu to ensure it is balanced within domain. Attempt to move
  * tasks if there is an imbalance.
  *
- * Called with this_rq unlocked.
+ * Called with this_rq unlocked. Interrupts may be enabled.
  */
 static int load_balance(int this_cpu, struct rq *this_rq,
 			struct sched_domain *sd, enum idle_type idle)
@@ -2541,6 +2541,7 @@ static int load_balance(int this_cpu, st
 	unsigned long imbalance;
 	struct rq *busiest;
 	cpumask_t cpus = CPU_MASK_ALL;
+	unsigned long flags;
 
 	/*
 	 * When power savings policy is enabled for the parent domain, idle
@@ -2580,11 +2581,13 @@ redo:
 		 * still unbalanced. nr_moved simply stays zero, so it is
 		 * correctly treated as an imbalance.
 		 */
+		local_irq_save(flags);
 		double_rq_lock(this_rq, busiest);
 		nr_moved = move_tasks(this_rq, this_cpu, busiest,
 				      minus_1_or_zero(busiest->nr_running),
 				      imbalance, sd, idle, &all_pinned);
 		double_rq_unlock(this_rq, busiest);
+		local_irq_restore(flags);
 
 		/* All tasks on this runqueue were pinned by CPU affinity */
 		if (unlikely(all_pinned)) {
@@ -2601,13 +2604,13 @@ redo:
 
 		if (unlikely(sd->nr_balance_failed > sd->cache_nice_tries+2)) {
 
-			spin_lock(&busiest->lock);
+			spin_lock_irqsave(&busiest->lock, flags);
 
 			/* don't kick the migration_thread, if the curr
 			 * task on busiest cpu can't be moved to this_cpu
 			 */
 			if (!cpu_isset(this_cpu, busiest->curr->cpus_allowed)) {
-				spin_unlock(&busiest->lock);
+				spin_unlock_irqrestore(&busiest->lock, flags);
 				all_pinned = 1;
 				goto out_one_pinned;
 			}
@@ -2617,7 +2620,7 @@ redo:
 				busiest->push_cpu = this_cpu;
 				active_balance = 1;
 			}
-			spin_unlock(&busiest->lock);
+			spin_unlock_irqrestore(&busiest->lock, flags);
 			if (active_balance)
 				wake_up_process(busiest->migration_thread);
 
@@ -2831,13 +2834,19 @@ static inline unsigned long cpu_offset(i
 	return jiffies + cpu * HZ / NR_CPUS;
 }
 
-static void
-rebalance_tick(int this_cpu, struct rq *this_rq, enum idle_type idle)
+/*
+ * Called from a tasklet with interrupts enabled.
+ */
+static void rebalance_tick(unsigned long dummy)
 {
+	int this_cpu = smp_processor_id();
+	struct rq *this_rq = cpu_rq(this_cpu);
+	enum idle_type idle;
 	unsigned long this_load, interval, j = cpu_offset(this_cpu);
 	struct sched_domain *sd;
 	int i, scale;
 
+	idle = (current == this_rq->idle) ? SCHED_IDLE : NOT_IDLE;
 	this_load = this_rq->raw_weighted_load;
 
 	/* Update our load: */
@@ -2882,35 +2891,30 @@ rebalance_tick(int this_cpu, struct rq *
 		}
 	}
 }
+
+static DECLARE_TASKLET(rebalance, rebalance_tick, 0);
+
 #else
 /*
  * on UP we do not need to balance between CPUs:
  */
-static inline void rebalance_tick(int cpu, struct rq *rq, enum idle_type idle)
-{
-}
 static inline void idle_balance(int cpu, struct rq *rq)
 {
 }
 #endif
 
-static inline int wake_priority_sleeper(struct rq *rq)
+static inline void wake_priority_sleeper(struct rq *rq)
 {
-	int ret = 0;
-
 #ifdef CONFIG_SCHED_SMT
 	spin_lock(&rq->lock);
 	/*
 	 * If an SMT sibling task has been put to sleep for priority
 	 * reasons reschedule the idle task to see if it can now run.
 	 */
-	if (rq->nr_running) {
+	if (rq->nr_running)
 		resched_task(rq->idle);
-		ret = 1;
-	}
 	spin_unlock(&rq->lock);
 #endif
-	return ret;
 }
 
 DEFINE_PER_CPU(struct kernel_stat, kstat);
@@ -3057,10 +3061,8 @@ void scheduler_tick(void)
 	rq->timestamp_last_tick = now;
 
 	if (p == rq->idle) {
-		if (wake_priority_sleeper(rq))
-			goto out;
-		rebalance_tick(cpu, rq, SCHED_IDLE);
-		return;
+		wake_priority_sleeper(rq);
+		goto out;
 	}
 
 	/* Task might have expired already, but not scheduled off yet */
@@ -3135,7 +3137,9 @@ void scheduler_tick(void)
 out_unlock:
 	spin_unlock(&rq->lock);
 out:
-	rebalance_tick(cpu, rq, NOT_IDLE);
+#ifdef CONFIG_SMP
+	tasklet_schedule(&rebalance);
+#endif
 }
 
 #ifdef CONFIG_SCHED_SMT



* Re: [RFC] sched_tick with interrupts enabled
  2006-10-18 21:59           ` Christoph Lameter
@ 2006-10-19  2:19             ` Siddha, Suresh B
  2006-10-19 10:16               ` KAMEZAWA Hiroyuki
  2006-10-19 15:50               ` Christoph Lameter
From: Siddha, Suresh B @ 2006-10-19  2:19 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Siddha, Suresh B, Ingo Molnar, Peter Williams,
	linux-kernel

On Wed, Oct 18, 2006 at 02:59:07PM -0700, Christoph Lameter wrote:
> Load balancing has the potential of running for some time if, e.g., the
> sched_domains for a system with 1024 processors have to be balanced.
> We currently do all of that with interrupts disabled, so we may be unable
> to service interrupts for some time. Most of that time is potentially
> spent in rebalance_tick.

Did you see an actual issue because of this, or is it just theoretical?

> +static void rebalance_tick(unsigned long dummy)
>  {
> +	int this_cpu = smp_processor_id();
> +	struct rq *this_rq = cpu_rq(this_cpu);
> +	enum idle_type idle;
>  	unsigned long this_load, interval, j = cpu_offset(this_cpu);
>  	struct sched_domain *sd;
>  	int i, scale;
>  
> +	idle = (current == this_rq->idle) ? SCHED_IDLE : NOT_IDLE;

We need to add an nr_running check too, to determine whether we want to do
idle or not-idle balancing. This is in the context of wake_priority_sleeper().

thanks,
suresh


* Re: [RFC] sched_tick with interrupts enabled
  2006-10-19  2:19             ` Siddha, Suresh B
@ 2006-10-19 10:16               ` KAMEZAWA Hiroyuki
  2006-10-19 15:50               ` Christoph Lameter
From: KAMEZAWA Hiroyuki @ 2006-10-19 10:16 UTC (permalink / raw)
  To: Siddha, Suresh B
  Cc: clameter, nickpiggin, suresh.b.siddha, mingo, pwil3058,
	linux-kernel, kaneshige.kenji@soft.fujitsu.com

On Wed, 18 Oct 2006 19:19:00 -0700
"Siddha, Suresh B" <suresh.b.siddha@intel.com> wrote:

> On Wed, Oct 18, 2006 at 02:59:07PM -0700, Christoph Lameter wrote:
> > Load balancing has the potential of running for some time if, e.g., the
> > sched_domains for a system with 1024 processors have to be balanced.
> > We currently do all of that with interrupts disabled, so we may be unable
> > to service interrupts for some time. Most of that time is potentially
> > spent in rebalance_tick.
> 
> Did you see an actual issue because of this, or is it just theoretical?
> 

IIRC, Fujitsu's 64-cpu ia64/SMP system suffered from this issue *in an older
kernel*. Now we avoid it by creating NUMA nodes and dividing the scheduler
domains.

The situation was: 5 runnable processes, each pinned to a cpu, on a 64-cpu
system. The system was constantly rebalancing and appeared to be hung.

-Kame



* Re: [RFC] sched_tick with interrupts enabled
  2006-10-19  2:19             ` Siddha, Suresh B
  2006-10-19 10:16               ` KAMEZAWA Hiroyuki
@ 2006-10-19 15:50               ` Christoph Lameter
  2006-10-19 20:40                 ` Siddha, Suresh B
From: Christoph Lameter @ 2006-10-19 15:50 UTC (permalink / raw)
  To: Siddha, Suresh B; +Cc: Nick Piggin, Ingo Molnar, Peter Williams, linux-kernel

On Wed, 18 Oct 2006, Siddha, Suresh B wrote:

> On Wed, Oct 18, 2006 at 02:59:07PM -0700, Christoph Lameter wrote:
> > Load balancing has the potential of running for some time if, e.g., the
> > sched_domains for a system with 1024 processors have to be balanced.
> > We currently do all of that with interrupts disabled, so we may be unable
> > to service interrupts for some time. Most of that time is potentially
> > spent in rebalance_tick.
> 
> Did you see an actual issue because of this, or is it just theoretical?

Yes, we actually had some livelocks when things got really busy, and we even
missed timer interrupts. We now have another workaround on IA64 that
enables/disables interrupts in the timer tick, but it would be good
to address this issue in a general way.

> > +static void rebalance_tick(unsigned long dummy)
> >  {
> > +	int this_cpu = smp_processor_id();
> > +	struct rq *this_rq = cpu_rq(this_cpu);
> > +	enum idle_type idle;
> >  	unsigned long this_load, interval, j = cpu_offset(this_cpu);
> >  	struct sched_domain *sd;
> >  	int i, scale;
> >  
> > +	idle = (current == this_rq->idle) ? SCHED_IDLE : NOT_IDLE;
> 
> We need to add nr_running check too, to determine if we want to do
> idle/not-idle balancing. This is in the context of wake_priority_sleeper()

Ok. Thanks. Would this work?

Index: linux-2.6.19-rc2-mm1/kernel/sched.c
===================================================================
--- linux-2.6.19-rc2-mm1.orig/kernel/sched.c	2006-10-19 09:39:08.000000000 -0500
+++ linux-2.6.19-rc2-mm1/kernel/sched.c	2006-10-19 09:42:10.733631242 -0500
@@ -2846,7 +2846,8 @@ static void rebalance_tick(unsigned long
 	struct sched_domain *sd;
 	int i, scale;
 
-	idle = (current == this_rq->idle) ? SCHED_IDLE : NOT_IDLE;
+	idle = (current == this_rq->idle && !this_rq->nr_running) ?
+				SCHED_IDLE : NOT_IDLE;
 	this_load = this_rq->raw_weighted_load;
 
 	/* Update our load: */


* Re: [RFC] sched_tick with interrupts enabled
  2006-10-19 15:50               ` Christoph Lameter
@ 2006-10-19 20:40                 ` Siddha, Suresh B
From: Siddha, Suresh B @ 2006-10-19 20:40 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Siddha, Suresh B, Nick Piggin, Ingo Molnar, Peter Williams,
	linux-kernel

On Thu, Oct 19, 2006 at 08:50:50AM -0700, Christoph Lameter wrote:
> Ok. Thanks. Would this work?
> 
> Index: linux-2.6.19-rc2-mm1/kernel/sched.c
> ===================================================================
> --- linux-2.6.19-rc2-mm1.orig/kernel/sched.c	2006-10-19 09:39:08.000000000 -0500
> +++ linux-2.6.19-rc2-mm1/kernel/sched.c	2006-10-19 09:42:10.733631242 -0500
> @@ -2846,7 +2846,8 @@ static void rebalance_tick(unsigned long
>  	struct sched_domain *sd;
>  	int i, scale;
>  
> -	idle = (current == this_rq->idle) ? SCHED_IDLE : NOT_IDLE;
> +	idle = (current == this_rq->idle && !this_rq->nr_running) ?
> +				SCHED_IDLE : NOT_IDLE;

A comment on why we are checking nr_running would be nice.

And one more thing: we can reduce some of the tasklet-invoking complexity
by checking whether any domain actually needs a load_balance() and invoking
the tasklet only then, rather than unconditionally invoking the tasklet on
each tick.

thanks,
suresh

