public inbox for linux-kernel@vger.kernel.org
* help? converting to single global prio_array in scheduler, ran into snag
From: Christopher Friesen @ 2006-04-05 16:54 UTC
  To: linux-kernel


We're having some issues with the load balancer algorithm in CKRM, so 
due to time pressure I'm looking at converting the scheduler to use a 
single global prio_array rather than the per-cpu ones that it currently 
uses.  I realize we're going to take a hit, but we don't have too many 
cpus so I'm hoping it won't be too bad.

So far I've removed arrays/expired/active from the runqueue and made 
them global, added a new spinlock to protect the global list (always 
taken after the runqueue lock), and converted all the callers to use the 
appropriate variable.  All changes were in sched.h and sched.c.
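
For readers following along, the arrangement described above can be modelled roughly as follows. This is a minimal userspace sketch, not the actual patch (which was never posted): pthread mutexes stand in for kernel spinlocks, every name ending in `_sketch` is hypothetical, and the bitmap that the real prio_array uses to find the highest priority is replaced by a linear scan. The only thing it is meant to show is the lock order stated above, runqueue lock first and the global array lock second.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_PRIO_SKETCH 140  /* same value as MAX_PRIO in the 2.6 scheduler */
#define NR_CPUS_SKETCH  2    /* assumption: a two-CPU box such as the G5 */

struct task_sketch {
    int prio;                    /* 0 is the highest priority */
    struct task_sketch *next;
};

/* Simplified prio_array: one singly linked list per priority level. */
struct prio_array_sketch {
    unsigned int nr_active;
    struct task_sketch *queue[MAX_PRIO_SKETCH];
};

/* The per-CPU runqueue keeps its own lock but no longer owns any arrays. */
struct runqueue_sketch {
    pthread_mutex_t lock;
    struct task_sketch *curr;
};

/* Formerly per-runqueue, now global and guarded by array_lock_sketch. */
static struct prio_array_sketch arrays_sketch[2];
static struct prio_array_sketch *active_sketch = &arrays_sketch[0];
static struct prio_array_sketch *expired_sketch = &arrays_sketch[1];
static pthread_mutex_t array_lock_sketch = PTHREAD_MUTEX_INITIALIZER;

static struct runqueue_sketch runqueues_sketch[NR_CPUS_SKETCH] = {
    { PTHREAD_MUTEX_INITIALIZER, NULL },
    { PTHREAD_MUTEX_INITIALIZER, NULL },
};

/* Queue a task from the given CPU: rq->lock first, then the array lock. */
static void enqueue_task_sketch(int cpu, struct task_sketch *p)
{
    struct runqueue_sketch *rq = &runqueues_sketch[cpu];

    pthread_mutex_lock(&rq->lock);
    pthread_mutex_lock(&array_lock_sketch);
    /* Head insertion for brevity; the kernel appends to the tail. */
    p->next = active_sketch->queue[p->prio];
    active_sketch->queue[p->prio] = p;
    active_sketch->nr_active++;
    pthread_mutex_unlock(&array_lock_sketch);
    pthread_mutex_unlock(&rq->lock);
}

/* Pick the highest-priority task for this CPU, or NULL if none is queued. */
static struct task_sketch *pick_next_sketch(int cpu)
{
    struct runqueue_sketch *rq = &runqueues_sketch[cpu];
    struct task_sketch *p = NULL;

    pthread_mutex_lock(&rq->lock);
    pthread_mutex_lock(&array_lock_sketch);
    if (active_sketch->nr_active == 0 && expired_sketch->nr_active > 0) {
        /* Swap active and expired, as schedule() does per runqueue. */
        struct prio_array_sketch *tmp = active_sketch;
        active_sketch = expired_sketch;
        expired_sketch = tmp;
    }
    for (int prio = 0; prio < MAX_PRIO_SKETCH && p == NULL; prio++) {
        if (active_sketch->queue[prio] != NULL) {
            p = active_sketch->queue[prio];
            active_sketch->queue[prio] = p->next;
            p->next = NULL;
            active_sketch->nr_active--;
        }
    }
    rq->curr = p;
    pthread_mutex_unlock(&array_lock_sketch);
    pthread_mutex_unlock(&rq->lock);
    return p;
}
```

In this model a task queued from one CPU can be picked by the other, which is the behaviour a single global array is meant to give; the price is that every enqueue and every pick contends on the one array lock.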

This builds for both UP and SMP. The UP kernel boots, and the SMP kernel 
boots only if I set the "nosmp" boot arg.

Unfortunately I seem to have missed something. On my Mac G5 if I allow 
it to use both cpus it gets to "smp_core99_setup_cpu 0 done", then hangs.

Anyone have any suggestions as to what I should look at?  Maybe the idle 
task initialization?

Thanks,

Chris



* Re: help? converting to single global prio_array in scheduler, ran into snag
From: Christopher Friesen @ 2006-04-05 17:23 UTC
  To: linux-kernel

I should clarify that CKRM is currently disabled--I'm trying to get the 
vanilla scheduler working first before changing the CKRM stuff to use 
per-class prio arrays rather than per-class per-cpu ones.

Chris


* Re: help? converting to single global prio_array in scheduler, ran into snag
From: Darren Hart @ 2006-04-06  3:34 UTC
  To: Christopher Friesen; +Cc: linux-kernel

On Wednesday 05 April 2006 10:23, you wrote:
> I should clarify that CKRM is currently disabled--I'm trying to get the
> vanilla scheduler working first before changing the CKRM stuff to use
> per-class prio arrays rather than per-class per-cpu ones.
>

First thing that comes to mind, did you look for every place that accesses the 
arrays via the rq->lock and make it use the new global array_lock?  It would 
help if you would post your initial patch for review (designating it as RFC, 
not intended for inclusion).

(Chris, sorry for the duplicate, forgot to cc the list first time around)

Thanks,

--Darren


* Re: help? converting to single global prio_array in scheduler, ran into snag
From: Christopher Friesen @ 2006-04-06 16:08 UTC
  To: Darren Hart; +Cc: linux-kernel

Darren Hart wrote:

> First thing that comes to mind, did you look for every place that accesses the 
> arrays via the rq->lock and make it use the new global array_lock?

Yep.  All places where any of "arrays[i]", "expired", or "active" were 
accessed are now protected (as far as I can tell) by the new lock.

I'm just wondering if there are any "gotchas" that jump out at people 
based on what I'm trying to do, or if it should just be a matter of 
changing the data structures and getting the locking right.  It's only 
when I try to run with multiple cpus that it breaks, so either there's 
something wrong in the initialization of the second cpu or else it's a 
locking issue.

When I let it use both cpus I get partway through kernel initialization, 
then it hangs.  Adding instrumentation lets me get further in, which 
makes me suspect some kind of race condition.
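
One way to test the locking theory without relying on timing is to make the ordering rule itself checkable. The following is only a userspace sketch under stated assumptions: the rule is the one given earlier in the thread (the array lock is always taken after the runqueue lock), pthread mutexes stand in for spinlocks, and all `*_chk` names are hypothetical. Each thread records whether it holds the array lock, and the runqueue-lock wrapper refuses to proceed if that would invert the order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t rq_lock_chk = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t array_lock_chk = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread record of whether this thread currently holds the array lock. */
static _Thread_local bool holds_array_lock_chk;

/*
 * Take the runqueue lock. Returns false, without locking, if the caller
 * already holds the array lock, since that would invert the stated order
 * and could deadlock against a CPU taking the locks in the right order.
 */
static bool checked_rq_lock(void)
{
    if (holds_array_lock_chk)
        return false;
    pthread_mutex_lock(&rq_lock_chk);
    return true;
}

static void checked_rq_unlock(void)
{
    pthread_mutex_unlock(&rq_lock_chk);
}

static void checked_array_lock(void)
{
    pthread_mutex_lock(&array_lock_chk);
    holds_array_lock_chk = true;
}

static void checked_array_unlock(void)
{
    /* Unlocking a lock this thread does not hold is a bug in the caller. */
    assert(holds_array_lock_chk);
    holds_array_lock_chk = false;
    pthread_mutex_unlock(&array_lock_chk);
}
```

A check like this turns an ordering violation into a deterministic failure at the offending call site, which matters here because added instrumentation is already changing how far the boot gets.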

> It would 
> help if you would post your initial patch for review (designating it as RFC, 
> not intended for inclusion).

Unfortunately my patch is against a heavily modified version of the 
kernel, so I'm not sure how useful it would be.  I suppose I could redo 
it against a vanilla version of 2.6.10, but that would take some time. 
If you think it would be useful I could certainly do it.

Chris
