* Nick's scheduler v17 @ 2003-10-24 18:10 Nick Piggin 2003-10-24 20:17 ` cliff white 2003-10-24 21:49 ` Andrew Theurer 0 siblings, 2 replies; 7+ messages in thread From: Nick Piggin @ 2003-10-24 18:10 UTC (permalink / raw) To: linux-kernel Hi, http://www.kerneltrap.org/~npiggin/v17/ Still working on SMP and NUMA. Some (maybe) interesting things I put in are - Sequential CPU balancing so you don't get a big storm of balances every 1/4s. - Balancing is trying to err more on the side of caution, I have to start analysing it more thoroughly though. - Attacked the NUMA balancing code. There should now be fewer buslocked ops / cache pingpongs in some fastpaths. Volanomark likes it; more realistic loads won't improve so much http://www.kerneltrap.org/~npiggin/v17/volano.png This improvement is NUMA only. I haven't had time to reproduce Cliff's serious reaim performance dropoffs, so they're probably still there. I couldn't reproduce Martin's kernbench dropoff, but the 16-way I'm using only has 512K cache, which might not show it up. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Nick's scheduler v17 2003-10-24 18:10 Nick's scheduler v17 Nick Piggin @ 2003-10-24 20:17 ` cliff white 2003-10-24 21:49 ` Andrew Theurer 1 sibling, 0 replies; 7+ messages in thread From: cliff white @ 2003-10-24 20:17 UTC (permalink / raw) To: Nick Piggin; +Cc: linux-kernel On Sat, 25 Oct 2003 04:10:24 +1000 Nick Piggin <piggin@cyberone.com.au> wrote: > Hi, > http://www.kerneltrap.org/~npiggin/v17/ > > Still working on SMP and NUMA. Some (maybe) interesting things I put in are > - Sequential CPU balancing so you don't get a big storm of balances > every 1/4s. > - Balancing is trying to err more on the side of caution, I have to start > analysing it more thoroughly though. > - Attacked the NUMA balancing code. There should now be fewer buslocked ops / > cache pingpongs in some fastpaths. Volanomark likes it, more realistic > loads > won't improve so much http://www.kerneltrap.org/~npiggin/v17/volano.png > This improvement is NUMA only. > > I haven't had time to reproduce Cliff's serious reaim performance > dropoffs so > they're probably still there. I couldn't reproduce Martin's kernbench > dropoff, > but the 16-way I'm using only has 512K cache which might not show it up. > Okay, I put both v17 patches in the OSDL Patch Lifecycle Manager ( http://www.osdl.org/plm-cgi/plm/ ) sched-v17 is PLM #2251 sched-nopolicy-v17 is PLM #2252 Results asap. cliffw -- The church is near, but the road is icy. The bar is far, but i will walk carefully. - Russian proverb
* Re: Nick's scheduler v17 2003-10-24 18:10 Nick's scheduler v17 Nick Piggin 2003-10-24 20:17 ` cliff white @ 2003-10-24 21:49 ` Andrew Theurer 2003-10-25 1:12 ` Nick Piggin 1 sibling, 1 reply; 7+ messages in thread From: Andrew Theurer @ 2003-10-24 21:49 UTC (permalink / raw) To: Nick Piggin; +Cc: linux-kernel On Friday 24 October 2003 13:10, Nick Piggin wrote: > Hi, > http://www.kerneltrap.org/~npiggin/v17/ > > Still working on SMP and NUMA. Some (maybe) interesting things I put in are > - Sequential CPU balancing so you don't get a big storm of balances > every 1/4s. > - Balancing is trying to err more on the side of caution, I have to start > analysing it more thoroughly though. + + *imbalance /= 2; + *imbalance = (*imbalance + FPT - 1) / FPT; I think I see what is going on here, but would something like this work out better? *imbalance = min(this_load - load_avg, load_avg - max_load) That way you take just enough to either have busiest_queue or this_rq's length be the load_avg. I suppose you could take even less, but IMO, the /=2 is what I really don't like. Perhaps: *imbalance = min(this_load - load_avg, load_avg - max_load); *imbalance = (*imbalance + FPT - 1) / FPT; This should work well for intranode balances; internode balances may need a little optimization, since the load_avg does not really represent the load avg of the two nodes in question, just one cpu from one of them and all the cpus from another.
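Andrew's proposal can be sketched as a standalone model (the function shape, names, and the value of FPT here are illustrative assumptions, not the actual kernel code; the min() is written with the busiest-vs-this orientation he uses later in the thread):

```c
#include <assert.h>

/* Assumed fixed-point scaling: loads are stored as nr_running * FPT. */
#define FPT 128

/* Hypothetical standalone model of the proposed imbalance calculation:
 * move just enough load that either this runqueue or the busiest
 * runqueue ends up at the average, then round up to whole tasks. */
static unsigned long imbalance_tasks(unsigned long this_load,
                                     unsigned long max_load,
                                     unsigned long avg_load)
{
    unsigned long excess  = max_load - avg_load;  /* surplus on busiest CPU */
    unsigned long deficit = avg_load - this_load; /* shortfall on this CPU */
    unsigned long imbalance = excess < deficit ? excess : deficit;

    /* Ceiling division back from fixed point to a whole task count. */
    return (imbalance + FPT - 1) / FPT;
}
```

For example, with this_load at 1 task, max_load at 3 and the average at 2 (all in fixed point), both the surplus and the deficit are one task, so exactly one task moves and both queues land on the average.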
* Re: Nick's scheduler v17 2003-10-24 21:49 ` Andrew Theurer @ 2003-10-25 1:12 ` Nick Piggin 2003-10-26 6:43 ` Nick Piggin 0 siblings, 1 reply; 7+ messages in thread From: Nick Piggin @ 2003-10-25 1:12 UTC (permalink / raw) To: Andrew Theurer; +Cc: linux-kernel Andrew Theurer wrote: >On Friday 24 October 2003 13:10, Nick Piggin wrote: > >>Hi, >>http://www.kerneltrap.org/~npiggin/v17/ >> >>Still working on SMP and NUMA. Some (maybe) interesting things I put in are >>- Sequential CPU balancing so you don't get a big storm of balances >>every 1/4s. >>- Balancing is trying to err more on the side of caution, I have to start >> analysing it more thoroughly though. >> > >+ >+ *imbalance /= 2; >+ *imbalance = (*imbalance + FPT - 1) / FPT; > >I think I see what is going on here, but would something like this work out >better? > Yeah, sorry it's not well commented. It's still changing quite quickly. > > *imbalance = min(this_load - load_avg, load_avg - max_load) > >That way you take just enough to either have busiest_queue or this_rq's length >be the load_avg. I suppose you could take even less, but IMO, the /=2 is >what I really don't like. Perhaps: > That is _exactly_ what I had before! That's probably the way to go. Thanks for having a look at it. > > >*imbalance = min(this_load - load_avg, load_avg - max_load); >*imbalance = (*imbalance + FPT - 1) / FPT; > >This should work well for intranode balances, internode balances may need a >little optimization, since the load_avg does not really represent the >load avg of the two nodes in question, just one cpu from one of them and all >the cpus from another. > Yeah, that does need a bit of rethinking.
* Re: Nick's scheduler v17 2003-10-25 1:12 ` Nick Piggin @ 2003-10-26 6:43 ` Nick Piggin 2003-10-27 17:02 ` Andrew Theurer 0 siblings, 1 reply; 7+ messages in thread From: Nick Piggin @ 2003-10-26 6:43 UTC (permalink / raw) To: Nick Piggin; +Cc: Andrew Theurer, linux-kernel Nick Piggin wrote: > > > Andrew Theurer wrote: > >> On Friday 24 October 2003 13:10, Nick Piggin wrote: > >>> Hi, >>> http://www.kerneltrap.org/~npiggin/v17/ >>> >>> Still working on SMP and NUMA. Some (maybe) interesting things I put >>> in are >>> - Sequential CPU balancing so you don't get a big storm of balances >>> every 1/4s. >>> - Balancing is trying to err more on the side of caution, I have to >>> start >>> analysing it more thoroughly though. >>> >> >> + >> + *imbalance /= 2; >> + *imbalance = (*imbalance + FPT - 1) / FPT; >> >> I think I see what is going on here, but would something like this >> work out better? >> > > Yeah, sorry it's not well commented. It's still changing quite quickly. > >> >> *imbalance = min(this_load - load_avg, load_avg - max_load) >> >> That way you take just enough to either have busiest_queue or >> this_rq's length be the load_avg. I suppose you could take even >> less, but IMO, the /=2 is what I really don't like. Perhaps: >> > > That is _exactly_ what I had before! That's probably the way to go. Thanks > for having a look at it. > >> >> >> *imbalance = min(this_load - load_avg, load_avg - max_load); >> *imbalance = (*imbalance + FPT - 1) / FPT; >> >> This should work well for intranode balances, internode balances may >> need a little optimization, since the load_avg does not really >> represent the load avg of the two nodes in question, just one cpu >> from one of them and all the cpus from another. >> Oh, actually, after my patch, load_avg represents the load average of _all_ the nodes. Have a look at find_busiest_node. Which jogs my memory of why it's not always a good idea to do your *imbalance min(...) thing (I actually saw this happening). 
5 CPUs, 4 processes running on one CPU: load_avg would be 0.8 for all CPUs, so balancing doesn't happen. I have to think about this a bit more...
* Re: Nick's scheduler v17 2003-10-26 6:43 ` Nick Piggin @ 2003-10-27 17:02 ` Andrew Theurer 2003-10-27 23:13 ` Nick Piggin 0 siblings, 1 reply; 7+ messages in thread From: Andrew Theurer @ 2003-10-27 17:02 UTC (permalink / raw) To: Nick Piggin; +Cc: linux-kernel > >> *imbalance = min(this_load - load_avg, load_avg - max_load) > >> > >> That way you take just enough to either have busiest_queue or > >> this_rq's length be the load_avg. I suppose you could take even > >> less, but IMO, the /=2 is what I really don't like. Perhaps: > > > > That is _exactly_ what I had before! That's probably the way to go. Thanks > > for having a look at it. > > > >> *imbalance = min(this_load - load_avg, load_avg - max_load); > >> *imbalance = (*imbalance + FPT - 1) / FPT; > >> > >> This should work well for intranode balances, internode balances may > >> need a little optimization, since the load_avg does not really > >> represent the load avg of the two nodes in question, just one cpu > >> from one of them and all the cpus from another. > > Oh, actually, after my patch, load_avg represents the load average of _all_ > the nodes. Have a look at find_busiest_node. Which jogs my memory of why > it's not always a good idea to do your *imbalance min(...) thing (I actually > saw this happening). Oops, I meant avg_load, which you calculate in find_busiest_queue on the fly. > 5 CPUs, 4 processes running on one cpu. load_avg would be 0.8 for all cpus. > balancing doesn't happen. I have to think about this a bit more... Actually, if we use avg_load, I guess it would be 0, since this is an unsigned long. Maybe avg_load needs to have a min value of 1. Then if we apply: *imbalance = min(max_load - avg_load, avg_load - this_load) min(4 - 1, 1 - 0) And imbalance looks a lot better. My only concern would be an idle CPU stealing from another, leaving the other CPU idle. I guess a check could be put there.
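The truncation problem Andrew is describing can be shown numerically (the FPT value and helper names here are illustrative assumptions):

```c
#include <assert.h>

#define FPT 128  /* assumed fixed-point scaling factor */

/* With fixed point the fractional part of the average survives... */
static unsigned long avg_load_fixed(unsigned long nr_tasks,
                                    unsigned long nr_cpus)
{
    return nr_tasks * FPT / nr_cpus;  /* 4 tasks / 5 CPUs -> 102, i.e. ~0.8 */
}

/* ...whereas a plain per-CPU task count truncates straight to zero,
 * which is what happens if avg_load is an unscaled unsigned long. */
static unsigned long avg_load_plain(unsigned long nr_tasks,
                                    unsigned long nr_cpus)
{
    return nr_tasks / nr_cpus;        /* 4 / 5 -> 0 */
}
```

So with an unscaled average of 0 for the 5-CPU example, min(max_load - avg_load, avg_load - this_load) collapses to 0 for every idle CPU, which is why Andrew suggests clamping avg_load to at least 1.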
* Re: Nick's scheduler v17 2003-10-27 17:02 ` Andrew Theurer @ 2003-10-27 23:13 ` Nick Piggin 0 siblings, 0 replies; 7+ messages in thread From: Nick Piggin @ 2003-10-27 23:13 UTC (permalink / raw) To: Andrew Theurer; +Cc: linux-kernel Andrew Theurer wrote: >>>> *imbalance = min(this_load - load_avg, load_avg - max_load) >>>> >>>>That way you take just enough to either have busiest_queue or >>>>this_rq's length be the load_avg. I suppose you could take even >>>>less, but IMO, the /=2 is what I really don't like. Perhaps: >>>> >>>That is _exactly_ what I had before! That's probably the way to go. Thanks >>>for having a look at it. >>> >>> >>>>*imbalance = min(this_load - load_avg, load_avg - max_load); >>>>*imbalance = (*imbalance + FPT - 1) / FPT; >>>> >>>>This should work well for intranode balances, internode balances may >>>>need a little optimization, since the load_avg does not really >>>>represent the load avg of the two nodes in question, just one cpu >>>>from one of them and all the cpus from another. >>>> >>Oh, actually, after my patch, load_avg represents the load average of _all_ >>the nodes. Have a look at find_busiest_node. Which jogs my memory of why >>it's not always a good idea to do your *imbalance min(...) thing (I actually >>saw this happening). >> > >Oops, I meant avg_load, which you calculate in find_busiest_queue on the fly. > OK > >>5 CPUs, 4 processes running on one cpu. load_avg would be 0.8 for all cpus. >>balancing doesn't happen. I have to think about this a bit more... >> > >Actually, if we use avg_load, I guess it would be 0, since this is an unsigned >long. Maybe avg_load needs to have a min value of 1. Then if we apply: > Well, it's got a fixed-point scaling factor. 
> >*imbalance = min(max_load - avg_load, avg_load - this_load) > min(4 - 1, 1 - 0) > > I think you want: *imbalance = min(max_load - avg_load, avg_load - this_load); if (*imbalance < 1*FPT && max_load - this_load > 1*FPT) *imbalance = 1*FPT; So if there is a total imbalance of more than 1 task, at least one will be moved. > >And imbalance looks a lot better. Only concern would be an idle cpu stealing >from another, leaving the other cpu idle. I guess a check could be put >there. > > pull_task won't pull a running task, so you get some protection there.
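Nick's guarded version can be modeled as a standalone sketch (the FPT value and the function wrapper are assumptions for illustration; the real code adjusts *imbalance in place):

```c
#include <assert.h>

#define FPT 128  /* assumed fixed-point scaling factor */

static unsigned long min_ul(unsigned long a, unsigned long b)
{
    return a < b ? a : b;
}

/* Sketch of the guarded imbalance: if the min() rounds to less than one
 * task but the busiest CPU really has more than one task over us, force
 * at least one task's worth of imbalance so something gets moved. */
static unsigned long guarded_imbalance(unsigned long this_load,
                                       unsigned long max_load,
                                       unsigned long avg_load)
{
    unsigned long imbalance = min_ul(max_load - avg_load,
                                     avg_load - this_load);

    if (imbalance < 1*FPT && max_load - this_load > 1*FPT)
        imbalance = 1*FPT;
    return imbalance;
}
```

For the 5-CPU example (this_load 0, max_load 4*FPT, truncated average 4*FPT/5 = 102): min(410, 102) = 102 is below one task, but max_load - this_load = 512 is well above it, so the guard bumps the imbalance to exactly one task.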