Jesse Barnes wrote:
> On Friday, July 16, 2004 1:53 am, Nick Piggin wrote:
>>Instead of a top level domain spanning all CPUs, have each CPU's top level
>>domain just span all CPUs within a couple of hops (enough to get, say 16 to
>>64 CPUs into each top level domain). I could give you a hand with this if
>>you need.
>
>
> Yeah, that's what I had in mind. I'll wait for the patch you mentioned above
> and hack on top of that...
>

The patch is attached, although it needs a bit of commenting and testing.
Also, the init_sched_build_groups helper function in kernel/sched.c
probably wants to be exported for use by architecture code.

Out of interest, what sort of performance problems are you seeing with
this high rate of global balancing? I have a couple of patches to cut
down runqueue locking to almost zero in interrupt paths, although I
imagine the main problem you are having is pulling a cacheline off every
remote CPU when calculating runqueue loads?
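
To make the "span all CPUs within a couple of hops" idea quoted above a bit more concrete, here is a rough user-space sketch. It is not the attached patch and not kernel code. The ring topology, the machine size, and every name in it (node_hops, cpu_to_node_id, span_within_hops, span_weight, top_level_span) are assumptions made up for illustration. It widens the hop radius around a CPU's home node until the span holds at least a requested number of CPUs, or covers the whole machine.

```c
#include <assert.h>
#include <stdint.h>

#define NR_NODES      8   /* hypothetical machine: 8 nodes */
#define CPUS_PER_NODE 4   /* hypothetical: 4 CPUs per node */
#define NR_CPUS       (NR_NODES * CPUS_PER_NODE)   /* 32, fits in a uint64_t mask */

/* Hypothetical topology: nodes sit on a ring, distance is the shorter way around. */
static int node_hops(int a, int b)
{
    int d = a > b ? a - b : b - a;
    return d < NR_NODES - d ? d : NR_NODES - d;
}

/* Hypothetical CPU numbering: CPUs are assigned to nodes in contiguous blocks. */
static int cpu_to_node_id(int cpu)
{
    return cpu / CPUS_PER_NODE;
}

/* Bitmask of every CPU whose node is at most max_hops away from cpu's node. */
static uint64_t span_within_hops(int cpu, int max_hops)
{
    uint64_t span = 0;
    int home = cpu_to_node_id(cpu);

    for (int other = 0; other < NR_CPUS; other++)
        if (node_hops(home, cpu_to_node_id(other)) <= max_hops)
            span |= UINT64_C(1) << other;
    return span;
}

/* Number of CPUs set in a span. */
static int span_weight(uint64_t span)
{
    int n = 0;

    for (; span; span &= span - 1)   /* clear the lowest set bit each pass */
        n++;
    return n;
}

/*
 * Widen the radius one hop at a time until the span holds at least
 * min_cpus CPUs. On this ring, NR_NODES / 2 hops reaches every node,
 * so the loop falls back to a machine-wide span if min_cpus is too large.
 */
static uint64_t top_level_span(int cpu, int min_cpus)
{
    assert(cpu >= 0 && cpu < NR_CPUS);

    uint64_t span = 0;

    for (int hops = 0; hops <= NR_NODES / 2; hops++) {
        span = span_within_hops(cpu, hops);
        if (span_weight(span) >= min_cpus)
            break;
    }
    return span;
}
```

With these made-up numbers, top_level_span(0, 16) stops at two hops and covers nodes 6, 7, 0, 1 and 2 (20 CPUs), so balancing in CPU 0's top level domain would touch 20 runqueues rather than all 32. A real implementation would presumably take the distances from the architecture's own topology information rather than from a ring formula.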