Re: fair group scheduler not so fair?


From: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
To: "Chris Friesen" <cfriesen@nortel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	"Li, Tong N" <tong.n.li@intel.com>,
	linux-kernel@vger.kernel.org, mingo@elte.hu, pj@sgi.com
Subject: Re: fair group scheduler not so fair?
Date: Fri, 23 May 2008 13:14:31 +0530
Message-ID: <20080523074431.GJ3780@linux.vnet.ibm.com>
In-Reply-To: <48360D21.9060102@nortel.com>

On Thu, May 22, 2008 at 06:17:37PM -0600, Chris Friesen wrote:
> Peter Zijlstra wrote:
>
>> Given the following:
>>       root
>>      / | \
>>    _A_ 1  2
>>   /| |\
>>  3 4 5 B
>>       / \
>>      6   7
>>      CPU0            CPU1
>>      root            root
>>      /  \            /  \
>>     A    1          A    2
>>    / \             / \
>>   4   B           3   5
>>      / \
>>     6   7
>
> How do you move specific groups to different cpus.  Is this simply using 
> cpusets?

No. Moving groups to different cpus is just a group-aware extension to
move_tasks(), which is invoked as part of the regular load balance operation.
move_tasks()->sched_fair_class.load_balance() has been modified to
understand how much the task-groups at each level (e.g. A at level 1,
B at level 2) contribute to cpu load. It moves tasks between cpus
using this knowledge.

For example, if all the tasks shown above were on the same cpu, CPU0,
this is how it would look:

      CPU0            CPU1
      root            root
     / | \
   _A_ 1  2
  /| |\
 3 4 5 B
      / \
     6   7

Then CPU0 load = weight of A + weight of 1 + weight of 2
               = 1024 + 1024 + 1024 = 3072

while CPU1 load = 0.

Load to be moved to cut down this imbalance = 3072/2 = 1536

move_tasks() running on CPU1 would iteratively try to pull tasks such
that the total weight moved is <= 1536.

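        Task moved      Total weight moved
        ----------      ------------------
            2           1024
            3           1024 + 256 = 1280
            5           1280 + 256 = 1536

(Tasks 3 and 5 each count as 256 because A's weight of 1024 is split
among its four equally weighted children 3, 4, 5 and B:
1024 * 1024 / 4096 = 256.)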

resulting in:

      CPU0            CPU1
      root            root
      /  \            /  \
     A    1          A    2
    / \             / \
   4   B           3   5
      / \
     6   7
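
The pull sequence above can be sketched in Python. This is only an
illustration, not the kernel's move_tasks(): the tree dictionary, the
task_loads() and pull_tasks() helpers, and the candidate order (2, 3, 5,
taken from the table) are assumptions made for this example. It assumes
every task and group has weight 1024 and that a group's load is split
among its children in proportion to their weights.

```python
# Illustrative sketch only; all names here are hypothetical and this is
# not how the kernel implements move_tasks().

WEIGHT = 1024  # assumed weight of every task and every group

# CPU0 before balancing: group -> children (tasks are leaves)
tree = {
    "root": ["A", "1", "2"],
    "A": ["3", "4", "5", "B"],
    "B": ["6", "7"],
}


def task_loads(tree, node="root", share=None):
    """Return {task: load that the task contributes to the cpu}."""
    children = tree[node]
    total = WEIGHT * len(children)  # sum of the children's weights
    loads = {}
    for child in children:
        # Directly under root a child counts with its full weight; inside
        # a group, the group's load is split in proportion to weight.
        child_load = WEIGHT if share is None else share * WEIGHT // total
        if child in tree:
            loads.update(task_loads(tree, child, child_load))
        else:
            loads[child] = child_load
    return loads


def pull_tasks(loads, candidates, max_load):
    """Pull candidates in order while the running total stays <= max_load."""
    moved, total = [], 0
    for task in candidates:
        if total + loads[task] <= max_load:
            total += loads[task]
            moved.append((task, total))
    return moved


loads = task_loads(tree)
cpu0_load = sum(loads.values())  # weight of A + weight of 1 + weight of 2
imbalance = cpu0_load // 2       # load to move to the idle CPU1

for task, running_total in pull_tasks(loads, ["2", "3", "5"], imbalance):
    print(task, running_total)
```

With the candidate order from the table, the running totals come out as
1024, 1280 and 1536, matching the walk-through above.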

>> Numerical examples given the above scenario, assuming every body's
>> weight is 1024:
>
>>  s_(0,A) = s_(1,A) = 512
>
> Just to make sure I understand what's going on...this is half of 1024 
> because it shows up on both cpus?

Not exactly. As Peter put it:

  s_(i,g) = W_g * rw_(i,g) / \Sum_j rw_(j,g)

In this case, 

  s_(0,A) = W_A * rw_(0, A) / \Sum_j rw_(j, A)

W_A = shares given to A by admin = 1024

rw_(0,A) = Weight of 4 + Weight of B = 1024 + 1024 = 2048
rw_(1,A) = Weight of 3 + Weight of 5 = 1024 + 1024 = 2048
\Sum_j rw_(j, A) = 4096

So,

  s_(0,A) = 1024 * 2048 / 4096 = 512


>>  s_(0,B) = 1024, s_(1,B) = 0
>
> This gets the full 1024 because it's only on one cpu.

Not exactly. Both of B's tasks (6 and 7) are queued on CPU0, so
rw_(0,B) = \Sum_j rw_(j,B) = 2048, and that's why
s_(0,B) = 1024 * 2048 / 2048 = 1024.
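
As a rough check of the numbers for A and B, the s_(i,g) formula can be
evaluated in a few lines of Python. The helper name per_cpu_shares() and
its list-based arguments are assumptions made for this sketch; the rw
values are the ones quoted in this thread.

```python
# Illustrative sketch only; per_cpu_shares() is a hypothetical helper,
# not a kernel function.

def per_cpu_shares(group_shares, rw_per_cpu):
    """Evaluate s_(i,g) = W_g * rw_(i,g) / Sum_j rw_(j,g) for every cpu i.

    group_shares is W_g, the shares given to the group by the admin.
    rw_per_cpu[i] is rw_(i,g), the weight queued in the group on cpu i.
    """
    total = sum(rw_per_cpu)
    if total == 0:
        # Nothing queued on any cpu: no share to distribute.
        return [0] * len(rw_per_cpu)
    return [group_shares * rw // total for rw in rw_per_cpu]


# Group A: 4 and B on CPU0, 3 and 5 on CPU1, every weight 1024.
rw_A = [1024 + 1024, 1024 + 1024]
print("A:", per_cpu_shares(1024, rw_A))

# Group B: 6 and 7 both on CPU0, nothing on CPU1.
rw_B = [1024 + 1024, 0]
print("B:", per_cpu_shares(1024, rw_B))
```

A's shares split evenly because it has the same weight queued on both
cpus, while B keeps all of its shares on CPU0 because that is where its
entire queued weight sits.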

>>  rw_(0,A) = rw(1,A) = 2048
>>  rw_(0,B) = 2048, rw_(1,B) = 0
>
> How do we get 2048?  Shouldn't this be 1024?

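I hope the above clarifies this: rw_(0,A) is the weight of 4 plus the
weight of B (1024 + 1024 = 2048), and likewise rw_(0,B) is the weight of
6 plus the weight of 7.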

>>  h_load_(0,A) = h_load_(1,A) = 512
>>  h_load_(0,B) = 256, h_load(1,B) = 0
>
> At this point the numbers make sense, but I'm not sure how the formula for 
> h_load_ works given that I'm not sure what's going on for rw_.

-- 
Regards,
vatsa
